Concatenative Resynthesis Using Twin Networks
Authors
Abstract
Traditional noise reduction systems modify a noisy signal to make it more like the original clean signal. For speech, these methods suffer from two main problems: under-suppression of noise and over-suppression of target speech. Instead, synthesizing clean speech based on the noisy signal could produce outputs that are both noise-free and high quality. Our previous work introduced such a system using concatenative synthesis, but it required processing the clean speech at run time, which was slow and not scalable. To make such a system scalable, we propose learning a similarity metric using two separate networks: one processes the clean segments offline and the other processes the noisy segments at run time. This system incorporates a ranking loss to optimize for the retrieval of appropriate clean speech segments. The model is compared against our original system on the CHiME2-GRID corpus in terms of ranking performance and subjective listening tests of the resyntheses.
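The retrieval idea in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the authors' implementation: each "network" is a single random, untrained linear layer, similarity is a dot product of unit-length embeddings, the ranking loss is a generic hinge (triplet) form, and the names `linear_embed`, `ranking_loss`, and `retrieve` are hypothetical.

```python
import math
import random


def linear_embed(weights, features):
    """Map a feature vector to a unit-length embedding (stand-in for a network)."""
    out = [sum(w * x for w, x in zip(row, features)) for row in weights]
    norm = math.sqrt(sum(v * v for v in out)) or 1.0
    return [v / norm for v in out]


def similarity(a, b):
    """Dot product of two embeddings (assumed similarity measure)."""
    return sum(x * y for x, y in zip(a, b))


def ranking_loss(noisy_emb, pos_emb, neg_emb, margin=0.2):
    """Hinge ranking loss: the matching clean segment should score at least
    `margin` higher than a non-matching one (assumed form of the loss)."""
    return max(0.0, margin - similarity(noisy_emb, pos_emb)
               + similarity(noisy_emb, neg_emb))


def retrieve(noisy_emb, clean_index, k=1):
    """Return the indices of the k clean embeddings most similar to the query."""
    order = sorted(range(len(clean_index)),
                   key=lambda i: similarity(noisy_emb, clean_index[i]),
                   reverse=True)
    return order[:k]


rng = random.Random(0)
dim_in, dim_out = 8, 4

# Two separate sets of weights, one per network (random and untrained here,
# so the retrieved ranking is not meaningful; only the data flow is shown).
clean_weights = [[rng.gauss(0, 1) for _ in range(dim_in)] for _ in range(dim_out)]
noisy_weights = [[rng.gauss(0, 1) for _ in range(dim_in)] for _ in range(dim_out)]

# Offline: embed every clean segment once and store the result.
clean_segments = [[rng.gauss(0, 1) for _ in range(dim_in)] for _ in range(5)]
clean_index = [linear_embed(clean_weights, seg) for seg in clean_segments]

# Run time: embed only the noisy segment and rank the stored clean embeddings.
noisy_segment = [x + 0.1 * rng.gauss(0, 1) for x in clean_segments[2]]
noisy_emb = linear_embed(noisy_weights, noisy_segment)
top = retrieve(noisy_emb, clean_index, k=3)

# Training signal for one (noisy, matching clean, non-matching clean) triple.
loss = ranking_loss(noisy_emb, clean_index[2], clean_index[0])
```

The point of the split is that the clean-side embeddings are computed once and stored, so run-time cost is limited to one pass of the noisy-side network plus a similarity search over the stored index.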
Related resources
Autonomous Generation of Soundscapes using Unstructured Sound Databases
This research focuses on the generation of soundscapes using unstructured sound databases for the sonification of virtual environments. A generalized methodology for design based on soundscape categorization, perceptual discrimination of sources and media design principles is proposed, with the underlying principle of the composition of a source and a textural layer within any soundscape. A gen...
A System for Data-driven Concatenative Sound Synthesis
In speech synthesis, concatenative data-driven synthesis methods prevail. They use a database of recorded speech and a unit selection algorithm that selects the segments that match best the utterance to be synthesized. Transferring these ideas to musical sound synthesis allows a new method of high quality sound synthesis. Usual synthesis methods are based on a model of the sound signal. It is v...
Rule-based Emotion Synthesis Using Concatenated Speech
Concatenative speech synthesis is increasing in popularity, as it offers higher quality output than previous formant synthesisers. However, because it is based on recorded speech units, concatenative synthesis offers a lesser degree of parametric control during resynthesis. Consequently, adding pragmatic effects such as different speaking styles and emotions at the synthesis stage is fundamentally more...
A compositional approach to analysis re-synthesis
This report summarises the work achieved in the framework of a composer in research project in the IMTR team at IRCAM. In this project we have experimented with different realtime processing techniques for the analysis, transformation, and resynthesis of particular morphological aspects of live performed voice and instrumental sounds. Based on segment-wise sound descriptions, the developed tool...
Soundscape Generation for Virtual Environments using Community-Provided Audio Databases
This research focuses on the generation of soundscapes using unstructured sound databases for the sonification of virtual environments. The design methodology incorporates the use of concatenative synthesis to construct a sound environment using online community-provided sonic material, and an application of this methodology is described in which sound environments are generated for Google Stre...